In [28]:
## Import IPython and the other libraries needed for plotting and data manipulation
import IPython
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# import seaborn as sb (What is seaborn?)
%matplotlib inline
In [16]:
# Read anscombe.csv into data
# Hint: which of the libraries imported above loads data into a data frame? How do you *read* a .csv file?
data = '''do something here'''
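One possible way to approach the hint above is pandas' `read_csv`. The sketch below is only an illustration: the column names and values in `csv_text` are hypothetical stand-ins, since the actual layout of anscombe.csv is not shown here.

```python
import io

import pandas as pd

# Hypothetical stand-in for anscombe.csv (toy values, assumed column names).
csv_text = "x1,y1\n1,2.0\n2,4.0\n3,6.0\n"

# pd.read_csv accepts a file path or any file-like object, so an in-memory
# buffer keeps this sketch runnable without the real file.
data = pd.read_csv(io.StringIO(csv_text))
print(data.head())
```

With the real file, the call would presumably be `pd.read_csv('anscombe.csv')`, assuming the file sits in the notebook's working directory.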
In [ ]:
### Calculate and print mean of y1, y2, y3 and y4
### Hint: What library handles the numerical computation for data analysis?
In [ ]:
## Hint 2: print(np.mean(data.y1))
In [19]:
# Calculate and print variance value of y1, y2, y3 and y4
# Hint: numpy
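A point worth keeping in mind for the variance: `np.var` computes the population variance by default (`ddof=0`), while pandas' `.var()` defaults to the sample variance (`ddof=1`). The sketch below shows both on a hypothetical toy frame (not the anscombe.csv values); the same loop would apply to y1 through y4 once `data` is loaded.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the real data.
data = pd.DataFrame({"y1": [2.0, 4.0, 6.0], "y2": [1.0, 5.0, 9.0]})

stats = {}
for col in ["y1", "y2"]:
    values = data[col].to_numpy()
    stats[col] = {
        "mean": np.mean(values),
        "population_var": np.var(values),      # divides by n (ddof=0)
        "sample_var": np.var(values, ddof=1),  # divides by n - 1
    }
    print(col, stats[col])
```

Whichever convention you pick, use the same one for every series so the results stay comparable.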
In [ ]:
In [20]:
# Calculate and print mean of x1, x2, x3 and x4
In [21]:
In [24]:
# Calculate and print variance of x1, x2, x3 and x4
In [ ]:
In [25]:
# Calculate the covariance between each pair of x and y series
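As a rough sketch of one way to do this, `np.cov(x, y)` returns a 2x2 covariance matrix whose off-diagonal entry is the covariance between the two series (sample covariance by default). The frame below is a hypothetical toy example with assumed column names, not the anscombe.csv data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the real data is assumed to have x1..x4 and y1..y4.
data = pd.DataFrame({
    "x1": [1.0, 2.0, 3.0], "y1": [2.0, 4.0, 6.0],
    "x2": [1.0, 2.0, 3.0], "y2": [6.0, 4.0, 2.0],
})

covariances = {}
for i in (1, 2):
    # np.cov returns [[var(x), cov(x, y)], [cov(x, y), var(y)]].
    cov_matrix = np.cov(data[f"x{i}"].to_numpy(), data[f"y{i}"].to_numpy())
    covariances[i] = cov_matrix[0, 1]
    print(f"cov(x{i}, y{i}) = {covariances[i]}")
```

For the full exercise, the loop range would extend to all four pairs.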
In [ ]:
Not so fast! Let's plot the data first. For each pair of x and y series, draw a scatter plot of y against x. Matplotlib, whose pyplot module we imported as plt, provides flexible and easy plotting. The first plot has been done for you.
In [29]:
plt.scatter(data.x1,data.y1)
Out[29]:
In [ ]:
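If you would rather see all the pairs side by side than in separate cells, one option is a grid of subplots. This is only a sketch: the toy frame below is hypothetical and stands in for the loaded anscombe.csv data, and the column names are assumed to follow the x1..x4, y1..y4 pattern used above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy frame standing in for the real data.
data = pd.DataFrame({
    "x1": [1, 2, 3], "y1": [2.0, 4.1, 5.9],
    "x2": [1, 2, 3], "y2": [1.0, 4.0, 9.0],
    "x3": [1, 2, 3], "y3": [3.0, 3.1, 8.0],
    "x4": [2, 2, 5], "y4": [1.0, 3.0, 7.0],
})

# One panel per (x, y) pair, arranged in a 2x2 grid.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))
for i, ax in enumerate(axes.flat, start=1):
    ax.scatter(data[f"x{i}"], data[f"y{i}"])
    ax.set_xlabel(f"x{i}")
    ax.set_ylabel(f"y{i}")
    ax.set_title(f"y{i} vs x{i}")
fig.tight_layout()
```

Plotting the four panels together makes it easier to compare the shapes of the series directly.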